
Disable MLP Fused Ops if Not SwiGLU, Deprecate Fast Quantized Peft Plugin, Update Benchmarks #106

Merged — 7 commits merged into main on Nov 14, 2024

Conversation

@fabianlim (Contributor) commented Nov 8, 2024

This PR

  • disables the MLP fused ops when the activation function is not SwiGLU. In this implementation, the rules are generated upfront, before checking which model is activated.
  • removes fast_quantized_peft to prevent further confusion.
  • updates the benchmarks with all the new changes.
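The first bullet can be sketched as a simple upfront gate on the model's activation function. This is a minimal illustration, not the plugin's actual API: the names `should_apply_fused_mlp`, `DummyConfig`, and the supported-activation set are hypothetical, though `hidden_act` mirrors the attribute commonly found on Hugging Face model configs.

```python
# Hypothetical sketch: only apply fused-MLP rules when the MLP uses
# SwiGLU-style gating. All names here are illustrative.

# SwiGLU MLPs gate with SiLU (also called swish); other activations
# (e.g. gelu) should disable the fused op.
SUPPORTED_ACTIVATIONS = {"silu", "swish"}


def should_apply_fused_mlp(model_config) -> bool:
    """Return True only when the model's MLP activation is SwiGLU-compatible."""
    act = getattr(model_config, "hidden_act", None)
    return act in SUPPORTED_ACTIVATIONS


class DummyConfig:
    """Stand-in for a model config exposing `hidden_act`."""

    def __init__(self, hidden_act: str):
        self.hidden_act = hidden_act


print(should_apply_fused_mlp(DummyConfig("silu")))  # True
print(should_apply_fused_mlp(DummyConfig("gelu")))  # False
```

Generating such rules upfront, before inspecting the activated model, keeps the decision a cheap config check rather than a runtime patch that might silently misbehave on non-SwiGLU MLPs.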

Updated Benchmarks

Outliers
[figure: outlier benchmark plot]

Generally we noticed two things:
[figures: four benchmark comparison plots]

Attachment: outliers.csv

Base automatically changed from fix/lora-drop to main November 8, 2024 11:00
Signed-off-by: Yu Chin Fabian Lim <[email protected]>
@fabianlim fabianlim changed the title from "Disable MLP Fused Ops if Not SwiGLU and removed Deprecated Fast Quantized Peft Plugin" to "Disable MLP Fused Ops if Not SwiGLU, Deprecate Fast Quantized Peft Plugin, Update Benchmarks" on Nov 10, 2024
Signed-off-by: Yu Chin Fabian Lim <[email protected]>
@fabianlim fabianlim merged commit 9239802 into main Nov 14, 2024
7 checks passed
@fabianlim fabianlim deleted the fix/swiglu branch November 14, 2024 01:48